
[AArch64][SVE] Remove isSVECC() in favour of changing the calling convention #152742


Closed
wants to merge 1 commit into from

Conversation

MacDue
Member

@MacDue MacDue commented Aug 8, 2025

We essentially had two different ways of setting the calling convention for a function: the proper calling convention and a target flag to override the calling convention. Removing the flag in favour of setting the calling convention is an NFC except in some (somewhat unsupported) cases.

The cases that needed changes were:

  1. Removing the "SVE_VectorCall is unsupported on Darwin" failure
    • The previous method of setting the SVE calling convention means that it has been possible to use the SVE CC on Darwin for a while now (and some tests depend on that).
    • It is unclear if this really should be allowed or not (but does pose a question for SME functions on Darwin that have SVE params)
  2. Removing a Windows SVE test that used varargs and SVE arguments
    • This is not supported (and already fails LowerCall)
  3. Updating the checks for a Windows + exceptions SVE test

Note: The updated logic intends to closely match the code in AArch64TargetLowering::LowerCall (which also changes the CC).
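The scope-exit pattern described above can be sketched in isolation. The following is a minimal stand-alone illustration, not the actual LLVM code: the enum, the `lowerArguments` function, and its parameters are hypothetical, and a tiny RAII guard stands in for `llvm::make_scope_exit` (from `llvm/ADT/ScopeExit.h`).

```cpp
#include <cassert>
#include <utility>

// Minimal stand-in for llvm::make_scope_exit: the callback runs when the
// guard is destroyed.
template <typename Fn> struct ScopeExit {
  Fn Callback;
  ~ScopeExit() { Callback(); }
};
template <typename Fn> ScopeExit<Fn> makeScopeExit(Fn &&F) {
  return ScopeExit<Fn>{std::forward<Fn>(F)};
}

enum class CC { C, Fast, AArch64_SVE_VectorCall, PreserveAll };

// Hypothetical reduction of the pattern in LowerFormalArguments: a flag is
// set while the arguments are analysed, and a scope-exit guard upgrades the
// calling convention once that analysis is complete. Only C and Fast are
// upgraded, matching AArch64TargetLowering::LowerCall.
CC lowerArguments(CC CallConv, bool HasScalableArg) {
  bool ShouldUpgradeToSVECC = false;
  {
    auto Guard = makeScopeExit([&] {
      if (CallConv != CC::C && CallConv != CC::Fast)
        return;
      if (ShouldUpgradeToSVECC)
        CallConv = CC::AArch64_SVE_VectorCall;
    });
    // ...argument assignment would happen here; reduced to one flag:
    if (HasScalableArg)
      ShouldUpgradeToSVECC = true;
  } // Guard fires here, possibly rewriting CallConv.
  return CallConv;
}
```

In the real patch the guard additionally re-checks the return type via `GetReturnInfo` and rejects varargs with a fatal error; both steps are omitted here for brevity.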

@llvmbot
Member

llvmbot commented Aug 8, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Benjamin Maxwell (MacDue)



Full diff: https://github.com/llvm/llvm-project/pull/152742.diff

7 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp (+1-2)
  • (modified) llvm/lib/Target/AArch64/AArch64FrameLowering.cpp (+6-4)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+31-18)
  • (modified) llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h (-7)
  • (modified) llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp (+1-8)
  • (modified) llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll (-2)
  • (modified) llvm/test/CodeGen/AArch64/win-sve.ll (+21-92)
diff --git a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
index c52487ab8a79a..f7080465b4880 100644
--- a/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
+++ b/llvm/lib/Target/AArch64/AArch64AsmPrinter.cpp
@@ -1356,8 +1356,7 @@ void AArch64AsmPrinter::emitFunctionEntryLabel() {
   if (TT.isOSBinFormatELF() &&
       (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall ||
        MF->getFunction().getCallingConv() ==
-           CallingConv::AArch64_SVE_VectorCall ||
-       MF->getInfo<AArch64FunctionInfo>()->isSVECC())) {
+           CallingConv::AArch64_SVE_VectorCall)) {
     auto *TS =
         static_cast<AArch64TargetStreamer *>(OutStreamer->getTargetStreamer());
     TS->emitDirectiveVariantPCS(CurrentFnSym);
diff --git a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
index 885f2a94f85f5..3d8d79de17517 100644
--- a/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp
@@ -339,10 +339,10 @@ static bool requiresSaveVG(const MachineFunction &MF);
 // on the stack. This function is safe to be called before callee-saves or
 // object offsets have been determined.
 static bool isLikelyToHaveSVEStack(MachineFunction &MF) {
-  auto *AFI = MF.getInfo<AArch64FunctionInfo>();
-  if (AFI->isSVECC())
+  if (MF.getFunction().getCallingConv() == CallingConv::AArch64_SVE_VectorCall)
     return true;
 
+  auto *AFI = MF.getInfo<AArch64FunctionInfo>();
   if (AFI->hasCalculatedStackSizeSVE())
     return bool(getSVEStackSize(MF));
 
@@ -3121,12 +3121,14 @@ static unsigned getPrologueDeath(MachineFunction &MF, unsigned Reg) {
 static bool produceCompactUnwindFrame(MachineFunction &MF) {
   const AArch64Subtarget &Subtarget = MF.getSubtarget<AArch64Subtarget>();
   AttributeList Attrs = MF.getFunction().getAttributes();
-  AArch64FunctionInfo *AFI = MF.getInfo<AArch64FunctionInfo>();
   return Subtarget.isTargetMachO() &&
          !(Subtarget.getTargetLowering()->supportSwiftError() &&
            Attrs.hasAttrSomewhere(Attribute::SwiftError)) &&
          MF.getFunction().getCallingConv() != CallingConv::SwiftTail &&
-         !requiresSaveVG(MF) && !AFI->isSVECC();
+         MF.getFunction().getCallingConv() !=
+             CallingConv::AArch64_SVE_VectorCall &&
+
+         !requiresSaveVG(MF);
 }
 
 static bool invalidateWindowsRegisterPairing(unsigned Reg1, unsigned Reg2,
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index bad7ccde2e1e0..a8565b899dcf0 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -24,6 +24,7 @@
 #include "llvm/ADT/APInt.h"
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/SmallSet.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/ADT/SmallVectorExtras.h"
@@ -7837,7 +7838,7 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
     const SmallVectorImpl<ISD::InputArg> &Ins, const SDLoc &DL,
     SelectionDAG &DAG, SmallVectorImpl<SDValue> &InVals) const {
   MachineFunction &MF = DAG.getMachineFunction();
-  const Function &F = MF.getFunction();
+  Function &F = MF.getFunction();
   MachineFrameInfo &MFI = MF.getFrameInfo();
   bool IsWin64 =
       Subtarget->isCallingConvWin64(F.getCallingConv(), F.isVarArg());
@@ -7845,16 +7846,36 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
                     (isVarArg && Subtarget->isWindowsArm64EC());
   AArch64FunctionInfo *FuncInfo = MF.getInfo<AArch64FunctionInfo>();
 
-  SmallVector<ISD::OutputArg, 4> Outs;
-  GetReturnInfo(CallConv, F.getReturnType(), F.getAttributes(), Outs,
-                DAG.getTargetLoweringInfo(), MF.getDataLayout());
-  if (any_of(Outs, [](ISD::OutputArg &Out){ return Out.VT.isScalableVector(); }))
-    FuncInfo->setIsSVECC(true);
-
   // Assign locations to all of the incoming arguments.
   SmallVector<CCValAssign, 16> ArgLocs;
   CCState CCInfo(CallConv, isVarArg, MF, ArgLocs, *DAG.getContext());
 
+  // This logic is consistent with AArch64TargetLowering::LowerCall.
+  // The `ShouldUpgradeToSVECC` flag can be set while analyzing the arguments.
+  bool ShouldUpgradeToSVECC = false;
+  auto _ = make_scope_exit([&] {
+    if (CallConv != CallingConv::C && CallConv != CallingConv::Fast)
+      return;
+
+    if (!ShouldUpgradeToSVECC) {
+      // If the flag was not set, check if the return value requires the SVE CC.
+      SmallVector<ISD::OutputArg, 4> Outs;
+      GetReturnInfo(CallConv, F.getReturnType(), F.getAttributes(), Outs,
+                    DAG.getTargetLoweringInfo(), MF.getDataLayout());
+      ShouldUpgradeToSVECC = any_of(
+          Outs, [](ISD::OutputArg &Out) { return Out.VT.isScalableVector(); });
+    }
+
+    if (!ShouldUpgradeToSVECC)
+      return;
+
+    if (isVarArg)
+      report_fatal_error("Passing/returning SVE types to variadic functions "
+                         "is currently not supported");
+
+    F.setCallingConv(CallingConv::AArch64_SVE_VectorCall);
+  });
+
   // At this point, Ins[].VT may already be promoted to i32. To correctly
   // handle passing i8 as i8 instead of i32 on stack, we pass in both i32 and
   // i8 to CC_AArch64_AAPCS with i32 being ValVT and i8 being LocVT.
@@ -7942,14 +7963,14 @@ SDValue AArch64TargetLowering::LowerFormalArguments(
         RC = &AArch64::FPR128RegClass;
       else if (RegVT.isScalableVector() &&
                RegVT.getVectorElementType() == MVT::i1) {
-        FuncInfo->setIsSVECC(true);
         RC = &AArch64::PPRRegClass;
+        ShouldUpgradeToSVECC = true;
       } else if (RegVT == MVT::aarch64svcount) {
-        FuncInfo->setIsSVECC(true);
         RC = &AArch64::PPRRegClass;
+        ShouldUpgradeToSVECC = true;
       } else if (RegVT.isScalableVector()) {
-        FuncInfo->setIsSVECC(true);
         RC = &AArch64::ZPRRegClass;
+        ShouldUpgradeToSVECC = true;
       } else
         llvm_unreachable("RegVT not supported by FORMAL_ARGUMENTS Lowering");
 
@@ -8597,14 +8618,6 @@ bool AArch64TargetLowering::isEligibleForTailCallOptimization(
       CallAttrs.caller().hasStreamingBody())
     return false;
 
-  // Functions using the C or Fast calling convention that have an SVE signature
-  // preserve more registers and should assume the SVE_VectorCall CC.
-  // The check for matching callee-saved regs will determine whether it is
-  // eligible for TCO.
-  if ((CallerCC == CallingConv::C || CallerCC == CallingConv::Fast) &&
-      MF.getInfo<AArch64FunctionInfo>()->isSVECC())
-    CallerCC = CallingConv::AArch64_SVE_VectorCall;
-
   bool CCMatch = CallerCC == CalleeCC;
 
   // When using the Windows calling convention on a non-windows OS, we want
diff --git a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
index 800787cc0b4f5..dd7b97619f938 100644
--- a/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64MachineFunctionInfo.h
@@ -209,10 +209,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
 
   bool IsMTETagged = false;
 
-  /// The function has Scalable Vector or Scalable Predicate register argument
-  /// or return type
-  bool IsSVECC = false;
-
   /// The frame-index for the TPIDR2 object used for lazy saves.
   TPIDR2Object TPIDR2;
 
@@ -280,9 +276,6 @@ class AArch64FunctionInfo final : public MachineFunctionInfo {
   int64_t getStreamingVGIdx() const { return StreamingVGIdx; };
   void setStreamingVGIdx(unsigned FrameIdx) { StreamingVGIdx = FrameIdx; };
 
-  bool isSVECC() const { return IsSVECC; };
-  void setIsSVECC(bool s) { IsSVECC = s; };
-
   TPIDR2Object &getTPIDR2Obj() { return TPIDR2; }
 
   void initializeBaseYamlFields(const yaml::AArch64FunctionInfo &YamlMFI);
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
index 77dfab83a834a..81d9e02d548f1 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.cpp
@@ -104,8 +104,6 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
     if (MF->getFunction().getCallingConv() ==
         CallingConv::AArch64_SVE_VectorCall)
       return CSR_Win_AArch64_SVE_AAPCS_SaveList;
-    if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
-      return CSR_Win_AArch64_SVE_AAPCS_SaveList;
     return CSR_Win_AArch64_AAPCS_SaveList;
   }
   if (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall)
@@ -148,8 +146,6 @@ AArch64RegisterInfo::getCalleeSavedRegs(const MachineFunction *MF) const {
     // This is for OSes other than Windows; Windows is a separate case further
     // above.
     return CSR_AArch64_AAPCS_X18_SaveList;
-  if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
-    return CSR_AArch64_SVE_AAPCS_SaveList;
   return CSR_AArch64_AAPCS_SaveList;
 }
 
@@ -165,8 +161,7 @@ AArch64RegisterInfo::getDarwinCalleeSavedRegs(const MachineFunction *MF) const {
   if (MF->getFunction().getCallingConv() == CallingConv::AArch64_VectorCall)
     return CSR_Darwin_AArch64_AAVPCS_SaveList;
   if (MF->getFunction().getCallingConv() == CallingConv::AArch64_SVE_VectorCall)
-    report_fatal_error(
-        "Calling convention SVE_VectorCall is unsupported on Darwin.");
+    return CSR_Darwin_AArch64_SVE_AAPCS_SaveList;
   if (MF->getFunction().getCallingConv() ==
           CallingConv::AArch64_SME_ABI_Support_Routines_PreserveMost_From_X0)
     report_fatal_error(
@@ -205,8 +200,6 @@ AArch64RegisterInfo::getDarwinCalleeSavedRegs(const MachineFunction *MF) const {
     return CSR_Darwin_AArch64_RT_AllRegs_SaveList;
   if (MF->getFunction().getCallingConv() == CallingConv::Win64)
     return CSR_Darwin_AArch64_AAPCS_Win64_SaveList;
-  if (MF->getInfo<AArch64FunctionInfo>()->isSVECC())
-    return CSR_Darwin_AArch64_SVE_AAPCS_SaveList;
   return CSR_Darwin_AArch64_AAPCS_SaveList;
 }
 
diff --git a/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll b/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
index 9e18f778d1c46..9ec4bea520c53 100644
--- a/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
+++ b/llvm/test/CodeGen/AArch64/arm64-darwin-cc.ll
@@ -1,9 +1,7 @@
 ; RUN: sed -e "s,CC,cfguard_checkcc,g" %s | not --crash llc -mtriple=arm64-apple-darwin -o - 2>&1 | FileCheck %s --check-prefix=CFGUARD
-; RUN: sed -e "s,CC,aarch64_sve_vector_pcs,g" %s | not --crash llc -mtriple=arm64-apple-darwin -o - 2>&1 | FileCheck %s --check-prefix=SVE_VECTOR_PCS
 
 define CC void @f0() {
   unreachable
 }
 
 ; CFGUARD: Calling convention CFGuard_Check is unsupported on Darwin.
-; SVE_VECTOR_PCS: Calling convention SVE_VectorCall is unsupported on Darwin.
diff --git a/llvm/test/CodeGen/AArch64/win-sve.ll b/llvm/test/CodeGen/AArch64/win-sve.ll
index 53ac9344175a3..6bf8e58b9de29 100644
--- a/llvm/test/CodeGen/AArch64/win-sve.ll
+++ b/llvm/test/CodeGen/AArch64/win-sve.ll
@@ -784,8 +784,6 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
 ; CHECK-NEXT:  .seh_proc f6
 ; CHECK-NEXT:    .seh_handler __CxxFrameHandler3, @unwind, @except
 ; CHECK-NEXT:  // %bb.0:
-; CHECK-NEXT:    sub sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
 ; CHECK-NEXT:    addvl sp, sp, #-18
 ; CHECK-NEXT:    .seh_allocz 18
 ; CHECK-NEXT:    str p4, [sp] // 2-byte Folded Spill
@@ -853,21 +851,21 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
 ; CHECK-NEXT:    add x29, sp, #16
 ; CHECK-NEXT:    .seh_add_fp 16
 ; CHECK-NEXT:    .seh_endprologue
-; CHECK-NEXT:    sub sp, sp, #64
+; CHECK-NEXT:    sub sp, sp, #80
 ; CHECK-NEXT:    mov x0, #-2 // =0xfffffffffffffffe
 ; CHECK-NEXT:    addvl x8, x29, #18
 ; CHECK-NEXT:    mov x19, sp
-; CHECK-NEXT:    stur x0, [x8, #16]
+; CHECK-NEXT:    stur x0, [x8]
 ; CHECK-NEXT:    addvl x8, x29, #18
-; CHECK-NEXT:    ldr x1, [x8, #32]
-; CHECK-NEXT:  .Ltmp0:
+; CHECK-NEXT:    ldr x1, [x8, #16]
+; CHECK-NEXT:  .Ltmp0: // EH_LABEL
 ; CHECK-NEXT:    add x0, x19, #0
 ; CHECK-NEXT:    bl g6
-; CHECK-NEXT:  .Ltmp1:
+; CHECK-NEXT:  .Ltmp1: // EH_LABEL
 ; CHECK-NEXT:  // %bb.1: // %invoke.cont
 ; CHECK-NEXT:    .seh_startepilogue
-; CHECK-NEXT:    add sp, sp, #64
-; CHECK-NEXT:    .seh_stackalloc 64
+; CHECK-NEXT:    add sp, sp, #80
+; CHECK-NEXT:    .seh_stackalloc 80
 ; CHECK-NEXT:    ldp x29, x30, [sp, #16] // 16-byte Folded Reload
 ; CHECK-NEXT:    .seh_save_fplr 16
 ; CHECK-NEXT:    ldr x28, [sp, #8] // 8-byte Folded Reload
@@ -932,12 +930,8 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
 ; CHECK-NEXT:    .seh_save_preg p14, 10
 ; CHECK-NEXT:    ldr p15, [sp, #11, mul vl] // 2-byte Folded Reload
 ; CHECK-NEXT:    .seh_save_preg p15, 11
-; CHECK-NEXT:    add sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
 ; CHECK-NEXT:    addvl sp, sp, #18
 ; CHECK-NEXT:    .seh_allocz 18
-; CHECK-NEXT:    add sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
 ; CHECK-NEXT:    .seh_endepilogue
 ; CHECK-NEXT:    ret
 ; CHECK-NEXT:    .seh_endfunclet
@@ -1160,64 +1154,6 @@ define void @f8(<vscale x 2 x i64> %v) {
   ret void
 }
 
-define void @f9(<vscale x 2 x i64> %v, ...) {
-; CHECK-LABEL: f9:
-; CHECK:       .seh_proc f9
-; CHECK-NEXT:  // %bb.0:
-; CHECK-NEXT:    sub sp, sp, #64
-; CHECK-NEXT:    .seh_stackalloc 64
-; CHECK-NEXT:    addvl sp, sp, #-1
-; CHECK-NEXT:    .seh_allocz 1
-; CHECK-NEXT:    str z8, [sp] // 16-byte Folded Spill
-; CHECK-NEXT:    .seh_save_zreg z8, 0
-; CHECK-NEXT:    str x30, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .seh_save_reg_x x30, 16
-; CHECK-NEXT:    .seh_endprologue
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    add x9, sp, #8
-; CHECK-NEXT:    str x2, [x8, #32]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x0, [x8, #16]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x1, [x8, #24]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x3, [x8, #40]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x4, [x8, #48]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x5, [x8, #56]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x6, [x8, #64]
-; CHECK-NEXT:    addvl x8, sp, #1
-; CHECK-NEXT:    str x7, [x8, #72]
-; CHECK-NEXT:    add x8, sp, #16
-; CHECK-NEXT:    addvl x8, x8, #1
-; CHECK-NEXT:    str x8, [sp, #8]
-; CHECK-NEXT:    //APP
-; CHECK-NEXT:    //NO_APP
-; CHECK-NEXT:    .seh_startepilogue
-; CHECK-NEXT:    ldr x30, [sp] // 8-byte Folded Reload
-; CHECK-NEXT:    .seh_save_reg x30, 0
-; CHECK-NEXT:    add sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
-; CHECK-NEXT:    ldr z8, [sp] // 16-byte Folded Reload
-; CHECK-NEXT:    .seh_save_zreg z8, 0
-; CHECK-NEXT:    add sp, sp, #64
-; CHECK-NEXT:    .seh_stackalloc 64
-; CHECK-NEXT:    addvl sp, sp, #1
-; CHECK-NEXT:    .seh_allocz 1
-; CHECK-NEXT:    add sp, sp, #64
-; CHECK-NEXT:    .seh_stackalloc 64
-; CHECK-NEXT:    .seh_endepilogue
-; CHECK-NEXT:    ret
-; CHECK-NEXT:    .seh_endfunclet
-; CHECK-NEXT:    .seh_endproc
-  %va_list = alloca ptr
-  call void @llvm.va_start.p0(ptr %va_list)
-  call void asm "", "r,~{d8},~{memory}"(ptr %va_list)
-  ret void
-}
-
 declare void @g10(ptr,ptr)
 define void @f10(i64 %n, <vscale x 2 x i64> %x) "frame-pointer"="all" {
 ; CHECK-LABEL: f10:
@@ -1546,40 +1482,33 @@ define tailcc void @f15(double %d, <vscale x 4 x i32> %vs, [9 x i64], i32 %i) {
 ; CHECK-LABEL: f15:
 ; CHECK:       .seh_proc f15
 ; CHECK-NEXT:  // %bb.0:
-; CHECK-NEXT:    addvl sp, sp, #-1
-; CHECK-NEXT:    .seh_allocz 1
-; CHECK-NEXT:    str z8, [sp] // 16-byte Folded Spill
-; CHECK-NEXT:    .seh_save_zreg z8, 0
-; CHECK-NEXT:    str x28, [sp, #-16]! // 8-byte Folded Spill
-; CHECK-NEXT:    .seh_save_reg_x x28, 16
+; CHECK-NEXT:    str x28, [sp, #-32]! // 8-byte Folded Spill
+; CHECK-NEXT:    .seh_save_reg_x x28, 32
 ; CHECK-NEXT:    str x30, [sp, #8] // 8-byte Folded Spill
 ; CHECK-NEXT:    .seh_save_reg x30, 8
-; CHECK-NEXT:    sub sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
+; CHECK-NEXT:    str d8, [sp, #16] // 8-byte Folded Spill
+; CHECK-NEXT:    .seh_save_freg d8, 16
 ; CHECK-NEXT:    addvl sp, sp, #-1
 ; CHECK-NEXT:    .seh_allocz 1
 ; CHECK-NEXT:    .seh_endprologue
-; CHECK-NEXT:    addvl x8, sp, #2
+; CHECK-NEXT:    addvl x8, sp, #1
+; CHECK-NEXT:    addvl x9, sp, #1
 ; CHECK-NEXT:    //APP
 ; CHECK-NEXT:    //NO_APP
-; CHECK-NEXT:    stp d0, d0, [sp, #8]
 ; CHECK-NEXT:    ldr w8, [x8, #104]
-; CHECK-NEXT:    str w8, [sp, #8]
+; CHECK-NEXT:    str d0, [x9, #24]
+; CHECK-NEXT:    addvl x9, sp, #1
+; CHECK-NEXT:    str d0, [sp]
+; CHECK-NEXT:    str w8, [x9, #24]
 ; CHECK-NEXT:    .seh_startepilogue
 ; CHECK-NEXT:    addvl sp, sp, #1
 ; CHECK-NEXT:    .seh_allocz 1
-; CHECK-NEXT:    add sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
+; CHECK-NEXT:    ldr d8, [sp, #16] // 8-byte Folded Reload
+; CHECK-NEXT:    .seh_save_freg d8, 16
 ; CHECK-NEXT:    ldr x30, [sp, #8] // 8-byte Folded Reload
 ; CHECK-NEXT:    .seh_save_reg x30, 8
-; CHECK-NEXT:    ldr x28, [sp] // 8-byte Folded Reload
-; CHECK-NEXT:    .seh_save_reg x28, 0
-; CHECK-NEXT:    add sp, sp, #16
-; CHECK-NEXT:    .seh_stackalloc 16
-; CHECK-NEXT:    ldr z8, [sp] // 16-byte Folded Reload
-; CHECK-NEXT:    .seh_save_zreg z8, 0
-; CHECK-NEXT:    addvl sp, sp, #1
-; CHECK-NEXT:    .seh_allocz 1
+; CHECK-NEXT:    ldr x28, [sp], #32 // 8-byte Folded Reload
+; CHECK-NEXT:    .seh_save_reg_x x28, 32
 ; CHECK-NEXT:    add sp, sp, #80
 ; CHECK-NEXT:    .seh_stackalloc 80
 ; CHECK-NEXT:    .seh_endepilogue

@@ -784,8 +784,6 @@ define void @f6(<vscale x 2 x i64> %x, [8 x i64] %pad, i64 %n9) personality ptr
; CHECK-NEXT: .seh_proc f6
; CHECK-NEXT: .seh_handler __CxxFrameHandler3, @unwind, @except
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: sub sp, sp, #16
Member Author


I don't know enough about SEH to tell if this change is significant (to me, it seems insignificant). I also don't know if this case is fully supported.

@@ -1546,40 +1482,33 @@ define tailcc void @f15(double %d, <vscale x 4 x i32> %vs, [9 x i64], i32 %i) {
; CHECK-LABEL: f15:
; CHECK: .seh_proc f15
; CHECK-NEXT: // %bb.0:
; CHECK-NEXT: addvl sp, sp, #-1
Member Author

@MacDue MacDue Aug 8, 2025


This change is due to the explicit tailcc on the function. I'm not 100% sure about the correct behaviour here; LowerCall only uses the SVE CC if the original CC is C or Fast (which the new logic matches). https://reviews.llvm.org/D84869 suggests this is not a safe CC for SVE.

Collaborator


The LangRef says that the calling convention requires the prototype of all callees to exactly match the prototype of the function definition. So I guess adding tailcall on this function is not correct.

!requiresSaveVG(MF) && !AFI->isSVECC();
MF.getFunction().getCallingConv() !=
CallingConv::AArch64_SVE_VectorCall &&

Collaborator


nit: whitespace

Comment on lines +7872 to +7874
if (isVarArg)
report_fatal_error("Passing/returning SVE types to variadic functions "
"is currently not supported");
Collaborator


You can pass them to variadic functions, but not as variadic arguments.
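The distinction can be phrased as a small predicate. This is an illustrative sketch only: `Arg`, its fields, and `hasSVEAsVariadicArg` are hypothetical names that loosely mirror the fixed/variadic distinction the lowering code consults. A scalable value bound to a named parameter of a variadic function is accepted; one passed through the `...` is what the error rejects.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a lowered argument: IsScalable marks an SVE value,
// IsFixed marks a named (non-"...") parameter slot.
struct Arg {
  bool IsScalable;
  bool IsFixed;
};

// Returns true only when an SVE value travels through the variadic part.
// SVE values in the named portion of a variadic prototype are fine.
bool hasSVEAsVariadicArg(const std::vector<Arg> &Args) {
  for (const Arg &A : Args)
    if (A.IsScalable && !A.IsFixed)
      return true;
  return false;
}
```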

Member Author


Ah, I was confused by the wording of our current errors, which say "Passing SVE types to variadic functions currently not supported" (and which, it appears, actually check for use as a variadic argument).


@MacDue MacDue marked this pull request as draft August 8, 2025 16:52
@MacDue
Member Author

MacDue commented Aug 8, 2025

Going to close this PR. While I think it is currently valid, I think the main confusing thing about isSVECC is the name (since it implies it changes/is the calling convention), whereas what it actually means is that we augment the calling convention if we have SVE args/results. Having this as a separate bit of state would be needed if we were ever to support "preserve_allcc" (and others) with SVE, though currently these less well-tested combinations are broken, e.g., see: https://godbolt.org/z/MdbsMMxeh

@MacDue MacDue closed this Aug 8, 2025
@paulwalker-arm
Collaborator

paulwalker-arm commented Aug 8, 2025

I just wanted to add some extra commentary in case somebody comes looking:

It's a nuisance but there is no SVE calling convention. There is the aarch64_sve_vector_pcs calling convention, which improves the performance of calling utility functions in the middle of SVE code, but otherwise there is no SVE calling convention.

What we did when adding SVE support was to modify the existing calling convention in a compatible way to document the requirement for handling state that was previously not available. This generally applies to all the calling conventions, for example, preserve_all should preserve SVE state if in use. I don't want to get into a situation where we need SVE versions of all the existing calling conventions. isSVECC() exists to distinguish those functions that have SVE state where the existing calling convention's new rules must be applied.

That said, I'm not going to fully defend the implementation as we certainly made use of "calling convention" mechanics in order to achieve the desired behaviour. If anything, I'd rather generalise the logic and only use "SVE calling convention" for functions marked with aarch64_sve_vector_pcs, but this is an implementation detail.
